gh-73936: Add hashlib.saslprep (RFC 4013) and use it in smtplib for Unicode credentials #148692
blink1073 wants to merge 3 commits into python:main
…e credentials Co-authored-by: Arnt Gulbrandsen <arnt@gulbrandsen.priv.no> Co-authored-by: Bernie Hackett <bernie.hackett@gmail.com>
|
Kicking to re-run the CLA bot |
|
I am not really fond of having it in hashlib. It has nothing to do with hashing and cryptographic primitives. Unfortunately having a new module requires a PEP usually. I also do not think it is right to change smtplib in the same PR. I am on mobile so hard to check, but what RFC are we guaranteeing to follow for SMTP? |
This is a good question. Supporting all the various RFCs related to email may just be too much for the stdlib, depending on whether we have a dedicated maintainer for it. We split out server-side support into I'm not really involved in any of this any more, so I don't really have a say. Aside: I still think it makes sense to have a standards-compliant email parsing library in the stdlib. |
|
I would rather prefer the following:
|
|
The motivating issue asked to support Unicode passwords in SMTP. #103611 proposed using I added it to It could be that there is a simpler approach to adding Unicode password support using RFC 6531. My main intent was to help carry the work from #103611 across the finish line, since I am a maintainer of the |
|
Well, I'm more or less once again the maintainer for smtplib after a long absence, though my primary focus is the email library. Right now I'm mostly focused there, rewriting the header parser to deal with a security-adjacent performance issue, so my head is currently loaded with the email RFCs, not the SMTP RFCs :)

I don't think RFC 6531 helps here; at least, a search for 'pass' does not get any hits. I think it is focused on the data, not on the auth mechanisms, but it has been a while since I scanned it, much less read it.

The issue you picked up from mentions saslprep being required by the RFC. It would be interesting to have a direct reference for that.

I think my comments on the original issue are relevant here: if we change smtplib so that you can at least pass binary passwords, then the user can at least implement what they personally need, even if smtplib doesn't directly support saslprep.

I think, since it is a standard and would directly enhance smtplib, imaplib, and poplib (if I understand correctly), supporting saslprep in the stdlib would be nice, and the contribution of the code is great, but we do probably want some sort of maintenance commitment as well? And then there is the question of where to put it, since it would be shared by all the email client code. So unfortunately a PEP might be needed; I'm not entirely clear on current procedures, hopefully Barry can speak to that. I think there is other auth code shared between those modules (or that could be shared) that could go in a common location, but I'd have to refresh my memory on those modules before I could say for sure.

So I see two orthogonal tasks here, and we could decide we should do either or both: we can get the possibility of non-ascii passwords working by allowing bytes passwords, and we can implement RFC compliant support for non-ascii unicode passwords, which is the much bigger job that this PR is focused on.

I don't think there's currently any PR for the first one :)

Oh, another thought on location for saslprep: in an ideal world we might reorganize the stdlib so that poplib, imaplib, and smtplib were all under the 'email' package. In that case saslprep would also go there. So maybe we just wink and put it there? Maybe we could avoid having to do a PEP that way? Or should we do one anyway? |
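For illustration, the first task (letting callers supply already-encoded bytes) could look roughly like the sketch below. It only builds the AUTH PLAIN initial response that RFC 4616 defines as UTF-8 `[authzid] NUL authcid NUL passwd`; the helper name `build_auth_plain` is hypothetical and is not part of smtplib or of this PR, and no normalisation such as SASLprep is applied.

```python
import base64


def build_auth_plain(user, password):
    """Return the base64 text for an AUTH PLAIN initial response.

    Hypothetical helper, not an smtplib API. `password` may be str
    (encoded as UTF-8 here) or bytes (used exactly as given, so the
    caller decides on any normalisation or encoding).
    """
    if isinstance(password, str):
        password = password.encode("utf-8")
    # Empty authorization identity, then NUL-separated user and password.
    message = b"\0" + user.encode("utf-8") + b"\0" + password
    return base64.b64encode(message).decode("ascii")


# The result is pure ASCII, so it could be sent by hand, for example:
#   smtp.docmd("AUTH", "PLAIN " + build_auth_plain("user", "pässword"))
response = build_auth_plain("user", "pässword")
```

Because the base64 output is ASCII, this route avoids any `'ascii'` encoding of the raw credentials, at the cost of leaving all preparation of the password to the caller.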
This is a continuation of #103611, updated to address all open reviewer comments.
Changes
- `Lib/_saslprep.py` (new): RFC 4013 SASLprep implementation, adapted from the PyMongo project with permission. Apache 2.0 licence header retained; MongoDB corporate CLA is signed.
- `Lib/hashlib.py`: exposes `saslprep` as `hashlib.saslprep` (public API), per the suggestion from @gpshead in the original review.
- `Lib/smtplib.py`: applies `hashlib.saslprep()` in `auth_plain()`, `auth_login()`, and `auth_cram_md5()`; switches `auth()` encoding from `'ascii'` to `'utf-8'`.
- `Lib/test/test_saslprep.py` (new): tests for RFC 4013 examples, character mapping, prohibited characters, bidirectional checks, unassigned code points, and test cases from the MongoDB JS saslprep library.
- `Lib/test/test_smtplib.py`: adds Unicode credentials to the simulated server (Devanagari username/password; a password that SASLprep normalises via NFKC) and exercises all three auth mechanisms with them, including verifying that SASLprep-equivalent passwords authenticate successfully.
- `Python/stdlib_module_names.h`: registers `_saslprep`.
- `Doc/library/hashlib.rst`: new "String preparation" section documenting `hashlib.saslprep()`.

📚 Documentation preview 📚: https://cpython-previews--148692.org.readthedocs.build/
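For readers unfamiliar with RFC 4013, the steps named in the test description above (character mapping, NFKC normalisation, prohibited characters, bidirectional checks, unassigned code points) can be sketched with the existing `stringprep` and `unicodedata` modules. This is a minimal illustration and not the code in `Lib/_saslprep.py`; the function name `saslprep_sketch` and its `allow_unassigned` parameter are hypothetical.

```python
import stringprep
import unicodedata

# Prohibited-output tables listed in RFC 4013, section 2.3.
_PROHIBITED = (
    stringprep.in_table_c12,      # non-ASCII space characters
    stringprep.in_table_c21_c22,  # ASCII and non-ASCII control characters
    stringprep.in_table_c3,       # private use
    stringprep.in_table_c4,       # non-character code points
    stringprep.in_table_c5,       # surrogate codes
    stringprep.in_table_c6,       # inappropriate for plain text
    stringprep.in_table_c7,       # inappropriate for canonical representation
    stringprep.in_table_c8,       # change display properties / deprecated
    stringprep.in_table_c9,       # tagging characters
)


def saslprep_sketch(data, allow_unassigned=True):
    """Illustrative RFC 4013 preparation; raises ValueError on failure."""
    # 1. Map: non-ASCII spaces become U+0020, "mapped to nothing" is dropped.
    mapped = "".join(
        " " if stringprep.in_table_c12(ch) else ch
        for ch in data
        if not stringprep.in_table_b1(ch)
    )
    # 2. Normalise with NFKC; stringprep is defined against Unicode 3.2.
    prepared = unicodedata.ucd_3_2_0.normalize("NFKC", mapped)
    # 3. Reject prohibited output (and unassigned code points if requested).
    checks = _PROHIBITED
    if not allow_unassigned:
        checks = checks + (stringprep.in_table_a1,)
    for ch in prepared:
        if any(check(ch) for check in checks):
            raise ValueError(f"prohibited character U+{ord(ch):04X}")
    # 4. Bidirectional rules from RFC 3454, section 6.
    if any(stringprep.in_table_d1(ch) for ch in prepared):
        if any(stringprep.in_table_d2(ch) for ch in prepared):
            raise ValueError("mixed RandALCat and LCat characters")
        if not (stringprep.in_table_d1(prepared[0])
                and stringprep.in_table_d1(prepared[-1])):
            raise ValueError("RandALCat must be first and last character")
    return prepared
```

With this sketch, the example inputs from RFC 4013 section 3 behave as the RFC describes: a soft hyphen is removed, U+00AA and U+2168 are normalised to ASCII, and U+0007 or an Arabic letter followed by a digit is rejected.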